AITopics | Egypt

SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection

Neural Information Processing Systems | Jun-2-2025, 05:32:59 GMT

Synthetic Aperture Radar (SAR) object detection has gained significant attention recently due to its irreplaceable all-weather imaging capabilities. However, this research field suffers from both limited public datasets (mostly comprising <2K images with only mono-category objects) and inaccessible source code. To tackle these challenges, we establish a new benchmark dataset and an open-source method for large-scale SAR object detection. Our dataset, SARDet-100K, is the result of intensive surveying, collecting, and standardizing of 10 existing SAR detection datasets, providing a large-scale and diverse dataset for research purposes. To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created. With this high-quality dataset, we conducted comprehensive experiments and uncovered a crucial challenge in SAR object detection: the substantial disparities between pretraining on RGB datasets and finetuning on SAR datasets in terms of both data domain and model structure. To bridge these gaps, we propose a novel Multi-Stage with Filter Augmentation (MSFA) pretraining framework that tackles the problems from the perspective of data input, domain transition, and model migration. The proposed MSFA method significantly enhances the performance of SAR object detection models while demonstrating excellent generalizability and flexibility across diverse models. This work aims to pave the way for further advancements in SAR object detection.
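As a rough illustration of what standardizing several detection datasets into one can involve, the sketch below merges annotation dictionaries while remapping image, annotation, and category IDs. It assumes COCO-style dictionaries with `images`, `annotations`, and `categories` keys; the abstract does not state the target format, and the function name `merge_coco` and the name-normalization rule are hypothetical, not taken from the SARDet-100K toolkit.

```python
from typing import Any, Dict, List

Coco = Dict[str, List[Dict[str, Any]]]


def merge_coco(datasets: List[Coco]) -> Coco:
    """Merge COCO-style datasets, reassigning IDs so they never collide."""
    merged: Coco = {"images": [], "annotations": [], "categories": []}
    category_ids: Dict[str, int] = {}  # normalized category name -> merged id
    next_image_id = 1
    next_annotation_id = 1

    for ds in datasets:
        # Map this dataset's category ids onto shared ids, keyed by name.
        # Assumption: categories with the same lowercase name are the same class.
        local_to_merged_cat: Dict[int, int] = {}
        for cat in ds["categories"]:
            name = cat["name"].strip().lower()
            if name not in category_ids:
                category_ids[name] = len(category_ids) + 1
                merged["categories"].append({"id": category_ids[name], "name": name})
            local_to_merged_cat[cat["id"]] = category_ids[name]

        # Give every image a fresh id and remember the mapping.
        local_to_merged_img: Dict[int, int] = {}
        for img in ds["images"]:
            local_to_merged_img[img["id"]] = next_image_id
            merged["images"].append({**img, "id": next_image_id})
            next_image_id += 1

        # Rewrite annotations to point at the new image and category ids.
        for ann in ds["annotations"]:
            merged["annotations"].append({
                **ann,
                "id": next_annotation_id,
                "image_id": local_to_merged_img[ann["image_id"]],
                "category_id": local_to_merged_cat[ann["category_id"]],
            })
            next_annotation_id += 1

    return merged
```

Keying categories by normalized name is one possible design choice; a real merge would more likely use an explicit, hand-checked mapping between each source dataset's labels and the unified label set.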

artificial intelligence, image understanding, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > China (0.14)
Africa > Middle East > Egypt (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Energy (0.49)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.93)

Appendix of SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery

Neural Information Processing Systems | Jun-2-2025, 00:47:48 GMT

In this technical supplement, we provide detailed insights and additional results to support our main paper.

Section A.1 outlines the generation process of the SynRS3D dataset, including the tools and plugins used. It also covers the licenses for these plugins.
Section A.3 elaborates on the evaluation metrics for different tasks, including the proposed F
Section A.4 describes the experimental setup and the selection of hyperparameters for the RS3DAda method.
Section A.5 presents the ablation study results and analysis for the RS3DAda method.
Section A.6 provides supplementary experimental results combining SynRS3D and real data scenarios, complementing Section 5.2 of the main paper.
Section A.9 highlights the performance of models trained on the SynRS3D dataset using RS3DAda in the critical application of disaster mapping in remote sensing.

A.1 Detailed Generation Workflow of SynRS3D

The generation workflow of SynRS3D involves several key steps, from initializing sensor and sunlight parameters to generating the layout, geometry, and textures of the scene. This comprehensive process ensures that the generated SynRS3D mimics real-world remote sensing scenarios with high fidelity. The main steps of the workflow are as follows:
Initialization: Set up the sensor and sunlight parameters using uniform and normal distributions to simulate various conditions.
Layout Generation: Define the grid and terrain parameters to create diverse urban and natural environments.
Texture Generation: Use advanced models like GPT-4 [1] and Stable Diffusion [18] to generate realistic textures for different categories of land cover.
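The initialization step described above, drawing sensor and sunlight parameters from uniform and normal distributions, might look something like the following minimal sketch. The parameter names, ranges, means, and standard deviations are all hypothetical placeholders chosen for illustration; the excerpt does not give the values used for SynRS3D.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    sensor_height_m: float    # sensor altitude above ground (hypothetical unit)
    off_nadir_deg: float      # sensor tilt away from straight down
    sun_elevation_deg: float  # sun angle above the horizon
    sun_azimuth_deg: float    # sun compass direction


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def sample_scene_params(rng: random.Random) -> SceneParams:
    """Sample one scene configuration. All ranges below are assumed."""
    return SceneParams(
        # Uniform: every altitude in the range is equally likely.
        sensor_height_m=rng.uniform(400.0, 800.0),
        # Normal: mostly near-nadir views, with occasional larger tilts.
        off_nadir_deg=clamp(abs(rng.gauss(0.0, 5.0)), 0.0, 20.0),
        # Normal around a mid-day elevation, clamped to plausible daylight.
        sun_elevation_deg=clamp(rng.gauss(50.0, 15.0), 10.0, 90.0),
        # Uniform: the sun can lie in any compass direction.
        sun_azimuth_deg=rng.uniform(0.0, 360.0),
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a generated scene is reproducible
    print(sample_scene_params(rng))
```

Passing an explicit seeded `random.Random` instance keeps each generated scene reproducible, which is generally useful when a synthetic dataset has to be regenerated or extended later.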

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia > China (0.46)
North America > United States > Hawaii (0.14)
Asia > Middle East > Republic of Türkiye (0.14)
(2 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology (0.95)
Government (0.93)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery

Neural Information Processing Systems | Jun-2-2025, 00:47:46 GMT

Global semantic 3D understanding from single-view high-resolution remote sensing (RS) imagery is crucial for Earth observation (EO). However, this task faces significant challenges due to the high costs of annotations and data collection, as well as geographically restricted data availability. To address these challenges, synthetic data offer a promising solution by being unrestricted and automatically annotatable, thus enabling the provision of large and diverse datasets. We develop a specialized synthetic data generation pipeline for EO and introduce SynRS3D, the largest synthetic RS dataset. SynRS3D comprises 69,667 high-resolution optical images that cover six different city styles worldwide and feature eight land cover types, precise height information, and building change masks. To further enhance its utility, we develop a novel multi-task unsupervised domain adaptation (UDA) method, RS3DAda, coupled with our synthetic dataset, which facilitates the RS-specific transition from synthetic to real scenarios for land cover mapping and height estimation tasks, ultimately enabling global monocular 3D semantic understanding based on synthetic data. Extensive experiments on various real-world datasets demonstrate the adaptability and effectiveness of our synthetic dataset and the proposed RS3DAda method. SynRS3D and related codes are available at https://github.com/JTRNEO/SynRS3D.

artificial intelligence, machine learning, natural language, (13 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia > China (0.46)
North America > United States > Hawaii (0.14)
(3 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.74)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering

Neural Information Processing Systems | May-31-2025, 18:17:57 GMT

Large Language Models (LLMs) are widely used for knowledge-seeking purposes yet suffer from hallucinations. The knowledge boundary of an LLM limits its factual understanding, beyond which it may begin to hallucinate. Investigating the perception of LLMs' knowledge boundary is crucial for detecting hallucinations and LLMs' reliable generation. Current studies perceive LLMs' knowledge boundary on questions with concrete answers (close-ended questions) while paying limited attention to semi-open-ended questions that correspond to many potential answers. Some researchers achieve it by judging whether the question is answerable or not. However, this paradigm is not so suitable for semi-open-ended questions, which are usually "partially answerable questions" containing both answerable answers and ambiguous (unanswerable) answers.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Oceania (1.00)
Asia > China (0.68)
North America > Canada > Ontario (0.14)
(18 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Consumer Health (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Appendix A Related Work: A.1 Multimodal Large Language Models; A.2 Trustworthiness of LLMs

Neural Information Processing Systems | May-29-2025, 15:08:37 GMT

A.1 Multimodal Large Language Models

Building on the foundational capabilities of groundbreaking Large Language Models (LLMs) such as GPT [3], PaLM [6], Mistral [49], and LLaMA [108], which excel in language understanding and reasoning, recent innovations have integrated these models with other modalities (especially vision), leading to the development of Multimodal Large Language Models (MLLMs). These advanced MLLMs combine and process visual and textual data, demonstrating enhanced versatility in addressing both traditional vision tasks [21, 40, 42, 133] and complex multimodal challenges [34, 70, 136]. Among all MLLMs, proprietary models consistently perform well. OpenAI's GPT-4-Vision [82] pioneered this space by adeptly handling both text and image content. Anthropic's Claude 3 series [7] integrates advanced vision capabilities and multilingual support, enhancing its application across diverse cognitive and real-time tasks.

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.45)
Asia > China (0.28)
Europe > Italy (0.27)
Africa > Middle East > Egypt (0.14)

Genre:

Research Report (1.00)
Overview (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.93)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ac662d74829e4407ce1d126477f4a03a-Paper-Conference.pdf

Neural Information Processing Systems | May-25-2025, 08:29:10 GMT

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Africa > Middle East > Egypt (0.14)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Banking & Finance > Economy (1.00)
Energy (0.68)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.90)

a054ff49751dbc991ec30ae479397c3d-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | May-25-2025, 07:01:58 GMT

information retrieval, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia > China (0.93)
Asia > South Korea (0.68)
(4 more...)

Industry:

Energy (1.00)
Education (0.93)
Leisure & Entertainment > Sports > Tennis (0.93)
(4 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)
